14 research outputs found

    Trading off Energy versus Accuracy in Modern Computing Systems:From Digital Circuit Design to Programming Techniques

    Get PDF
    The slowdown of Moore's law, which has been the driving force of the electronics industry over the last 5 decades, is causing serious problem to Integrated Circuits (ICs) improvements. Technology scaling is becoming more and more complex and fabrication costs are growing exponentially. Furthermore, the energy gains associated to technology scaling are slowing down. Meanwhile, the expected boom of Internet of Things (IoT) devices requires ultra-low power ICs to be able to operate for several years without any user intervention, and energy-efficient computing system on the server side to treat all the gathered data. Approximate computing has emerged as an alternative way to improve energy-efficiency of both, high-performance and low-power computing systems by tolerating small and occasional errors. This energy-accuracy tradeoff can be applied on a wide range of over-engineered applications, particularly those involving human senses such as video and image processing. This thesis first presents an approximate circuit design technique called Gate-Level Pruning, which consists in selectively removing logic gates from any conventional circuit in order to reduce energy consumption, critical path delay, and area occupied on silicon. A Computer Aided Design (CAD) tool has been developed and integrated in the standard digital flow and has been evaluated on several arithmetic circuits, achieving up to 78% energy-delay-area savings. It is then shown how this methodology can be applied on more complex systems made of multiple arithmetic blocks but also memory: the discrete Cosine Transform(DCT), which is a key building block for image and video processing applications. Then, the speculative adder technique is presented. It consists in cutting carry chains to significantly relax the circuit timing constraints', and therefore drastically reduce energy consumption, area and delay. It is shown that this technique leads to errors of different nature than those produced by gate-level pruning. It is therefore worth combining GLP and speculative adders to obtain even higher savings. This has been verified on IEEE-754 floating point units integrated in a 65nm process within a low-power multi-core processor. Silicon measurements show up to 27% power, 36% area and 53% power-area savings. The second part of this thesis introduces software techniques to achieve similar energy-accuracy tradeoffs on commercially available processors. By switching from double precision to single precision floating-point data type and by exploiting vectorization capabilities of modern processors, a factor 2 energy can be saved on a Newton method for solving nonlinear equations. To further investigate the origins of these savings, an energy model based on Energy Per Instructions (EPI) has been built. It turns out that less than 6% of the total energy is consumed by arithmetic operations and that savings are achieved mainly by reducing the amount of data transferred between registers, cache and main memory. One way to reduce those power-hungry data movements is to use application specific hardware accelerators. Unfortunately, a commercial processor cannot embark accelerators for all the possible applications. To that extent, hardware accelerators are implemented on a Field Programmable Gate Array (FPGA) interconnected with a general-purpose processor to further reduce the energy consumption

    Near/Sub-Threshold Circuits and Approximate Computing: The Perfect Combination for Ultra-Low-Power Systems

    Get PDF
    While sub/near-threshold design offers the minimal power and energy consumption, such approach strongly deteriorates circuit performances and robustness against PVT (process/voltage/temperature) variations, leading to gigantic speed penalties and large silicon areas. Inexact and approximate circuit design can address these issues by trading calculation accuracy for better silicon area, circuit speed and even better power consumption. This paper reviews and proposes improvements for two approximate computing techniques applicable to arithmetic circuits: gate-level pruning and carry speculation. A critical study is then carried out considering several error metrics, and for the first time, those techniques are combined to produce approximate adders showing even higher gains at similar error levels. It is then shown that those techniques can be applied to a sub-threshold library to mitigate the large variability

    Energy-Efficient Digital Design Through Inexact and Approximate Arithmetic Circuits

    Get PDF
    Inexact and approximate circuit design is a promising approach to improve performance and energy efficiency in technology-scaled and low-power digital systems. Such strategy is suitable for error tolerant applications involving perceptive or statistical outputs. This paper reviews two established techniques applicable to arithmetic units: circuit pruning and carry speculation. A critical comparative study is carried out considering several error metrics

    Design of Energy-Efficient Discrete Cosine Transform using Pruned Arithmetic Circuits

    Get PDF
    Inexact circuits and approximate computing have been gaining a lot of interest in order to improve performances and energy efficiency beyond the boundaries of conventional digital circuits. Image and video processing is one of the best candidate for applying such techniques. As one of the key building blocks, Discrete Cosine Transform (DCT) accelerators are investigated using pruned arithmetic circuits. A design methodology is presented in order to optimize both image quality and circuit performances. This work demonstrates that with such technique, savings are possible not only on arithmetic units, but in the entire accelerator hardware. Simulations show up to 12\% area and 10\% power savings with less than 20dB PSNR degradation compared to the conventional DCT design

    Poster: Energy-Efficient Inexact Speculative Adder with High Performance and Accuracy Control

    Get PDF
    Approximate circuit design is a promising approach to improve performances and energy efficiency beyond the boundaries of conventional digital circuits. Such strategy is suitable for error-tolerant applications involving perceptive or statistical outputs. This work presents a novel architecture of an Inexact Speculative Adder for high-performance applications with optimized hardware and advanced error compensation technique. This adder allows precise tuning of accuracy to match design specifications while optimizing performances and energy efficiency. It demonstrates power savings up to 26% and energy-delay-area reductions up to 60% compared to the state-of-the-art

    Design and Applications of Approximate Circuits by Gate-Level Pruning

    Get PDF
    Energy-efficiency is a critical concern for many systems, ranging from Internet of things objects and mobile devices to high-performance computers. Moreover, after 40 years of prosperity, Moore's law is starting to show its economic and technical limits. Noticing that many circuits are over-engineered and that many applications are error-resilient or require less precision than offered by the existing hardware, approximate computing has emerged as a potential solution to pursue improvements of digital circuits. In this regard, a technique to systematically tradeoff accuracy in exchange for area, power, and delay savings in digital circuits is proposed: gate-level pruning (GLP). A CAD tool is build and integrated into a standard digital flow to offer a wide range of cost-accuracy tradeoffs for any conventional design. The methodology is first demonstrated on adders, achieving up to 78% energy-delay-area reduction for 10% mean relative error. It is then detailed how this methodology can be applied on a more complex system composed of a multitude of arithmetic blocks and memory: the discrete cosine transform (DCT), which is a key building block for image and video processing applications. Even though arithmetic circuits represent less than 4% of the entire DCT area, it is shown that the GLP technique can lead to 21% energy-delay-area savings over the entire system for a reasonable image quality loss of 24 dB. This significant saving is achieved thanks to the pruned arithmetic circuits, which sets some nodes at constant values, enabling the synthesis tool to further simplify the circuit and memory

    Design of Approximate Circuits by Fabrication of False Timing Paths: The Carry Cut-Back Adder

    Get PDF
    This paper introduces a novel method for designing approximate circuits by fabricating and exploiting false timing paths, i.e. critical paths that cannot be logically activated. This allows to strongly relax timing constraints while guaranteeing minimal and controlled behavioral change. This technique is applied to an approximate adder architecture, called the Carry Cut-Back Adder (CCBA), in which high-significance stages can cut the carry propagation chain at lower-significance positions. This lightweight approach prevents the logic activation of the carry chain, improving performance and energy efficiency while guaranteeing low worst-case errors. A design methodology is presented along with implementation, error optimization and design-space minimization. The CCBA is proven capable of extremely high accuracy while displaying significant circuit savings. For a worst-case precision of 99.999%, energy savings up to 36% are demonstrated compared to exact adders. Finally, an industry-oriented comparison of 32-bit approximate and truncated adders is carried out for mean and worst-case relative errors. The CCBA outperforms both state-of-the-art and truncated adders for high-accuracy and low-power circuits, confirming the interest of the proposed concept to help building highly-efficient approximate or precision-scalable hardware accelerators

    Automatic Generation of Inexact Digital Circuits by Gate-level Pruning

    Get PDF
    Inexact or approximate circuits show great ability to reduce power consumption at the cost of occasional errors in comparison to their conventional counterparts. Even though the benefits of such circuits have been proven for many applications, they are not wide spread owing to the absence of a clear design methodology and the required CAD tools. In this regard, this paper presents a methodology to automatically generate inexact circuits starting from a conventional design by adding only one small step in the digital design flow. Further, this paper also demonstrates that achieving pruning at gate-level can lead to substantial savings in terms of power consumption, critical path delay and silicon area. An order of magnitude area and power savings is demonstrated for a 64-bit gate level pruned high-speed adder for a 10% relative error magnitude

    Approximate 32-Bit Floating-Point Unit Design with 53% Power-Area Product Reduction

    Get PDF
    The floating-point unit is one of the most common building block in any computing system and is used for a huge number of applications. By combining two state-of-the-art techniques of imprecise hardware, namely Gate-Level Pruning and Inexact Speculative Adder, and by introducing a novel Inexact Speculative Multiplier architecture, three different approximate FPUs and one reference IEEE-754 compliant FPU have been integrated in a 65 nm CMOS process within a low-power multi-core processor. Silicon measurements show up to 27% power, 36% area and 53%power-area product savings compared to the IEEE-754 single-precision FPU. Accuracy loss has been evaluated with a high-dynamic-range image tone-mapping algorithm, resulting in small but non-visible errors with image PSNR value of 90 dB

    Cross-Layer Inexact Design for Low-Power Applications

    Get PDF
    Approximate and error tolerant circuits are a radical new approach to trade calculation accuracy for better speed, power, area and yield. The IcySoC project platform revisits low-power and low-voltage VLSI design through a cross-layer combined inexact design framework
    corecore